CLASSICAL AND QUANTUM INFO-MANIFOLDS 



R. F. Streater, Dept. of Maths., King's College London, Strand, WC2R 2LS 



1 Estimation; the Cramer-Rao inequality 


Let $\rho_\eta(x)$ be a probability density, depending on a parameter $\eta \in \mathbb{R}$. The
Fisher information of $\rho_\eta$ is defined to be
\[ G := \int \rho_\eta(x) \left( \frac{\partial \log \rho_\eta(x)}{\partial \eta} \right)^{2} dx. \tag{1} \]

We note that this is the variance of the random variable $Y = \partial \log \rho_\eta / \partial \eta$,
which has mean zero. $G$ is associated with the family $\mathcal{M} = \{\rho_\eta\}$ of distributions,
rather than any one of them. This concept arises in the theory of estimation as follows. Let $X$
be a random variable whose distribution is believed or hoped to be one of those in $\mathcal{M}$.
We estimate the value of $\eta$ by measuring $X$ independently $m$ times, getting the data
$x_1, \ldots, x_m$. An estimator $f$ is a function of $(x_1, \ldots, x_m)$ that is used for this
estimate. So $f$ is a function of $m$ independent copies of $X$, and so is a random variable. To be
useful, the estimator must be independent of $\eta$, which we do not (yet) know. We say that an
estimator is unbiased if its mean is the desired parameter; it is usual to take $f$ as a function
of $X$ and to regard $f(x_i)$, $i = 1, \ldots, m$, as samples of $f$. Then the condition that $f$ is
unbiased becomes
\[ \rho_\eta.f := \int \rho_\eta(x) f(x)\, dx = \eta. \tag{2} \]

We use the notation $\rho.f$ for the expectation of $f$ in the state $\rho$. A good estimator should
also have only a small chance of being far from the correct value, which is its mean if
it is unbiased. This chance is measured by the variance. Fisher [8] stated, and Rao
and Cramer proved, that the variance $V$ of an unbiased estimator $f$ obeys the inequality
$V \ge G^{-1}$. For the proof, differentiate eq. (2) w. r. t. $\eta$ to get
\[ \int \frac{\partial \rho_\eta(x)}{\partial \eta} f(x)\, dx = 1, \tag{3} \]



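which can be written as
\[ \int Y(x) (f(x) - \eta) \rho_\eta(x)\, dx
   = \int \left( \frac{\partial \log \rho_\eta}{\partial \eta} \right) (f(x) - \eta) \rho_\eta(x)\, dx = 1. \tag{4} \]
We note that this is the correlation of $Y$ and $f$, so the covariance matrix becomes
\[ \begin{pmatrix} G & 1 \\ 1 & V \end{pmatrix}. \tag{5} \]
This is positive semi-definite, so its determinant $GV - 1$ is non-negative, giving the result. □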
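
As a hedged numerical illustration of the inequality just proved, the following Python sketch
evaluates $G$ and $V$ exactly for a small mixture family on three points and checks that
$V \ge 1/G$. The distributions `u` and `w`, the estimator `f` and all function names are
assumptions made for this example; they do not come from the text.

```python
# Sketch: check V >= 1/G for the mixture family rho_eta = (1 - eta) u + eta w
# on the sample space {0, 1, 2}. All numbers below are illustrative choices.

def density(eta, u, w):
    """rho_eta(x) = (1 - eta) u(x) + eta w(x)."""
    return [(1.0 - eta) * ui + eta * wi for ui, wi in zip(u, w)]

def fisher_information(eta, u, w):
    """G = sum_x rho (d log rho / d eta)^2 = sum_x (w - u)^2 / rho, cf. eq. (1)."""
    rho = density(eta, u, w)
    return sum((wi - ui) ** 2 / r for ui, wi, r in zip(u, w, rho))

def mean_and_variance(eta, u, w, f):
    """Mean and variance of the estimator f in the state rho_eta."""
    rho = density(eta, u, w)
    mean = sum(r * fx for r, fx in zip(rho, f))
    var = sum(r * (fx - mean) ** 2 for r, fx in zip(rho, f))
    return mean, var

u = [0.5, 0.3, 0.2]
w = [0.1, 0.3, 0.6]
# f is chosen so that u.f = 0 and w.f = 1; then rho_eta.f = eta (unbiased).
c = 1.0 / 0.56
f = [-0.4 * c, 0.0, c]

for eta in (0.1, 0.5, 0.9):
    mean, var = mean_and_variance(eta, u, w, f)
    G = fisher_information(eta, u, w)
    print(f"eta={eta:.1f}  mean={mean:.6f}  V={var:.6f}  1/G={1.0 / G:.6f}")
```

This family is not an exponential family in $\eta$, so one expects the printed variance to lie
strictly above $1/G$ rather than on the bound.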
If we do $N$ independent measurements of the estimator, and average them, we improve
the inequality to $V \ge G^{-1}/N$. This inequality expresses that, given the family $\rho_\eta$,
there is a limit to the reliability with which we can estimate $\eta$. Fisher termed $V/G^{-1}$ the
efficiency of the estimator $f$. Equality in the Schwarz inequality occurs if and only if the two
functions are proportional. Let $-\partial \xi / \partial \eta$ denote the factor of proportionality.
Then the optimal estimator occurs when
\[ \log \rho_\eta(x) = -\int \frac{\partial \xi}{\partial \eta} (f(x) - \eta)\, d\eta. \tag{6} \]

Doing the integral, and adjusting the integration constant by normalisation, leads to
\[ \rho_\eta(x) = Z^{-1} \exp\{-\xi f(x)\}, \tag{7} \]
which is the 'exponential family'.

This can be generalised to any $n$-parameter manifold $\mathcal{M} = \{\rho_\eta\}$ of
distributions, $\eta = (\eta_1, \ldots, \eta_n)$ with $\eta \in \mathbb{R}^n$. Suppose we have unbiased
estimators $(f_1, \ldots, f_n)$, with covariance matrix $V$. Fisher introduced the information matrix
\[ G^{ij} = \int \rho_\eta(x) \frac{\partial \log \rho_\eta(x)}{\partial \eta_i}
            \frac{\partial \log \rho_\eta(x)}{\partial \eta_j}\, dx. \tag{8} \]

We note that $Y^i := \partial \log \rho_\eta / \partial \eta_i$ is a random variable with zero mean, and
that $G^{ij}$ is its covariance matrix. Rao remarked that $G^{ij}$ provides a Riemannian metric for
$\mathcal{M}$. We now derive the analogue of the inequality when $n > 1$. Put
$V_{ij} = \rho_\eta.[(f_i - \eta_i)(f_j - \eta_j)]$, the covariance matrix of $\{f_i\}$. Differentiate
the condition for being unbiased,
\[ \int \rho_\eta(x) f_i(x)\, dx = \eta_i, \tag{9} \]
with respect to $\eta_j$, and rearrange as above, to get
\[ \int \rho_\eta(x) Y^i(x) (f_j(x) - \eta_j)\, dx = \delta_{ij}. \tag{10} \]

This is the correlation between $Y^i$ and $f_j$. The covariance matrix of the $2n$ random variables
$Y^i$, $f_j$ therefore is
\[ \begin{pmatrix} G & I \\ I & V \end{pmatrix}. \tag{11} \]

This is therefore a positive semi-definite matrix. If it is not definite, it has zero as an
eigenvalue, which leads to $GV = I$, and the manifold must be the exponential family, as
before. If it is definite, so is its inverse, which is found to be
\[ \begin{pmatrix} (G - V^{-1})^{-1} & -G^{-1}(V - G^{-1})^{-1} \\
                   -V^{-1}(G - V^{-1})^{-1} & (V - G^{-1})^{-1} \end{pmatrix}. \tag{12} \]

It follows that the leading submatrices $(G - V^{-1})^{-1}$ and $(V - G^{-1})^{-1}$ are positive
definite, and thus so are their inverses. It follows that we get the matrix inequality
$V \ge G^{-1}$.
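
The boundary case $GV = I$ can be seen in a small example. The sketch below, in Python, takes
the family of all positive distributions on three outcomes, parametrised by the probabilities
of the first two outcomes, with the indicator functions of those outcomes as estimators. The
choice of family, the parameter values and the function names are assumptions made for
illustration only.

```python
# Sketch: G and V for the distributions on three outcomes, with
# eta = (eta_1, eta_2) the probabilities of outcomes 1 and 2 and
# f_i the indicator function of outcome i (illustrative choices).

def fisher_matrix(eta1, eta2):
    """G^{ij} = sum_x rho(x) Y^i(x) Y^j(x), Y^i = d log rho / d eta_i, cf. eq. (8)."""
    rho = [eta1, eta2, 1.0 - eta1 - eta2]
    # d rho(x) / d eta_i for the three outcomes x.
    drho = [[1.0, 0.0, -1.0], [0.0, 1.0, -1.0]]
    return [[sum(drho[i][x] * drho[j][x] / rho[x] for x in range(3))
             for j in range(2)] for i in range(2)]

def covariance_matrix(eta1, eta2):
    """V_ij = rho.[(f_i - eta_i)(f_j - eta_j)] for the indicator estimators."""
    rho = [eta1, eta2, 1.0 - eta1 - eta2]
    f = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    eta = [eta1, eta2]
    return [[sum(rho[x] * (f[i][x] - eta[i]) * (f[j][x] - eta[j]) for x in range(3))
             for j in range(2)] for i in range(2)]

def matmul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

G = fisher_matrix(0.2, 0.5)
V = covariance_matrix(0.2, 0.5)
GV = matmul2(G, V)
print(GV)  # numerically the 2x2 identity, so V = G^{-1} here
```

In this example the estimators are the mixture coordinates of an exponential family, which is
why the bound is attained rather than strict.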



2 Entropy methods, exponential families 

Gibbs knew that the state of maximum entropy, given the mean energy, is the canonical
state. More generally, let $\Omega$ be a countable sample space, and let $\Sigma$ denote the set of
probabilities (or states) on $\Omega$. Let $f_1, \ldots, f_n$ be $n$ linearly independent random
variables, whose means we can measure. We want to find the 'best' choice for the state, given
these means. The least prejudiced choice of $\rho$ (Jaynes) is to maximise the entropy $S$ subject
to the $n + 1$ constraints given by normalisation and the means of $f_j$, $j = 1, \ldots, n$. We use
$\lambda$, $\xi^j$ as Lagrange multipliers; then we must maximise
\[ S - \lambda \sum_\omega \rho(\omega) - \sum_{j=1}^{n} \xi^j \sum_\omega \rho(\omega) f_j(\omega) \]
by varying $\rho(\omega)$ subject to no constraints. We get
\[ \rho(\omega) = Z^{-1} \exp -\Big\{ \sum_j \xi^j f_j(\omega) \Big\} \quad \text{where} \quad
   Z = \sum_\omega \exp\Big\{ -\sum_j \xi^j f_j(\omega) \Big\}. \tag{13} \]

These make up the exponential manifold $\mathcal{M}$ determined by
$T := \mathrm{Span}\{f_1, \ldots, f_n\}$ and parametrised by the $\xi^j$; these are called the
canonical coordinates on $\mathcal{M}$, which has dimension $n$. At least one, say $f_1$, must be
bounded below, to ensure $Z < \infty$ holds for some $\xi$.

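The $\xi^j$ are determined by the given expectation values by the conditions $\rho.f_j = \eta_j$,
$j = 1, \ldots, n$. The $\eta_j$ are thus also coordinates for the manifold (the mixture coords.) It is
easy to show that
\[ \eta_j = -\frac{\partial \Psi}{\partial \xi^j}, \quad j = 1, \ldots, n; \qquad
   V_{jk} = -\frac{\partial \eta_j}{\partial \xi^k}, \quad j, k = 1, \ldots, n, \tag{14} \]
where $\Psi = \log Z$, and that $\Psi$ is a convex function of $\xi^j$. The Legendre dual to $\Psi$ is
$\Psi + \xi^i \eta_i$ (summed over $i$), and this is the entropy $S = -\rho.\log \rho$. The dual
relations are
\[ \xi^j = \frac{\partial S}{\partial \eta_j}, \qquad G^{jk} = -\frac{\partial \xi^j}{\partial \eta_k}. \tag{15} \]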

By the rule for Jacobians, V and G are mutual inverses. Therefore, the method of maximum 
entropy leads to the exponential family, which allows the optimisation of the Cramer-Rao 
bound, and gives us estimators of 100% efficiency. 
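
The relations of this section can be checked numerically on a finite sample space. The
following Python sketch builds the exponential state of eq. (13) for two random variables on
four points, compares the means with a finite-difference derivative of $\Psi = \log Z$, and
evaluates the entropy against the Legendre expression. The random variables `F`, the point
`xi`, the step `h` and the function names are all assumptions chosen for illustration.

```python
import math

# Sketch of eqs. (13)-(15) on a four-point sample space.
# The random variables F and the point xi are illustrative assumptions.
F = [[0.0, 1.0, 2.0, 3.0],   # f_1(omega)
     [0.0, 1.0, 4.0, 9.0]]   # f_2(omega)
N_POINTS = len(F[0])

def exponent(xi, w):
    """sum_j xi^j f_j(omega) at the sample point omega = w."""
    return sum(x * fj[w] for x, fj in zip(xi, F))

def log_Z(xi):
    """Psi = log Z, with Z as in eq. (13)."""
    return math.log(sum(math.exp(-exponent(xi, w)) for w in range(N_POINTS)))

def state(xi):
    """rho(omega) = Z^{-1} exp(-sum_j xi^j f_j(omega))."""
    psi = log_Z(xi)
    return [math.exp(-exponent(xi, w) - psi) for w in range(N_POINTS)]

def means(xi):
    """Mixture coordinates eta_j = rho.f_j."""
    rho = state(xi)
    return [sum(r * v for r, v in zip(rho, fj)) for fj in F]

def entropy(xi):
    """S = -rho.log rho."""
    return -sum(r * math.log(r) for r in state(xi))

def shifted(xi, j, h):
    """Copy of xi with h added to the j-th canonical coordinate."""
    return [x + (h if i == j else 0.0) for i, x in enumerate(xi)]

xi = [0.3, 0.1]
h = 1e-5
eta = means(xi)
for j in range(len(xi)):
    # Central difference for -dPsi/dxi^j, to be compared with eta_j (eq. (14)).
    numeric = -(log_Z(shifted(xi, j, h)) - log_Z(shifted(xi, j, -h))) / (2 * h)
    print(j, eta[j], numeric)

# With the signs of eqs. (13) and (14), -rho.log rho equals Psi + sum_j xi^j eta_j.
legendre = log_Z(xi) + sum(x * e for x, e in zip(xi, eta))
print(entropy(xi), legendre)
```

The finite-difference step is a convenience of the sketch; an exact check would differentiate
$\Psi$ symbolically.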

3 Manifolds modelled by Orlicz spaces 



Pistone and Sempi [21] have developed a version of information geometry, which does not
depend on a choice of the span of a finite number of estimators. Let $(\Omega, \mu)$ be a measure
space and let $\mathcal{M}$ be the set of all probability measures that are equivalent to $\mu$; such
a measure is determined by its Radon-Nikodym derivative $\rho$ relative to $\mu$. The topology on
$\mathcal{M}$ is not given by the $L^1$-distance, but by an Orlicz norm.

Given $\rho \in \mathcal{M}$, the Cramer class at $\rho$ is the set of all random variables $X$ on
$(\Omega, \mu)$ such that the moment-generating function
\[ X_\rho(t) := \int e^{-tX} \rho\, d\mu \tag{16} \]
is finite in a 'hood of the origin. This is enough to ensure that it is analytic in an interval
about $t = 0$. The Cramer class $C_\rho$ at a point $\rho$ in $\mathcal{M}$ is furnished with the
Luxemburg norm
\[ \|X\|_\rho = \inf \Big\{ r > 0 : E_\rho \Big[ \cosh\Big( \frac{X}{r} \Big) - 1 \Big] < 1 \Big\}. \tag{17} \]

The Cramer class $C$ at $\rho$ is an Orlicz space, and so is a Banach space with this norm. The
centred Cramer class $C(0)$ is defined as the subset of $C$ at $\rho$ with zero mean in the state
$\rho$; this is a closed subspace. A sufficiently small ball in the quotient Banach space $C/C(0)$
then parametrises a 'hood of $\rho$, and can be identified with the tangent space at $\rho$; namely,
the 'hood contains those points $\sigma$ of $\mathcal{M}$ such that
\[ \sigma = Z^{-1} e^{-X} \rho \quad \text{for some } X \in C, \tag{18} \]
where $Z$ is a normalising factor. Pistone and Sempi show that the bilinear form
\[ G(X, Y) = E_\rho[XY] \tag{19} \]
is a Riemannian metric on the tangent space $C/C(0)$, thus generalising the Fisher-Rao theory.

This theory is called non-parametric estimation theory, because we do not limit the
distributions to those specified by a finite number of parameters, but allow any 'shape' for
the density $\rho$. It is this construction that we take over to the quantum case, except that
the spectrum is discrete and the distributions are not always equivalent.
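
On a finite sample space the Luxemburg norm of eq. (17) can be computed by bisection, since
$E_\rho[\cosh(X/r) - 1]$ does not increase as $r$ grows. The Python sketch below is one way to
do this; the function names, the tolerance and the two-point example are assumptions for
illustration, not part of the Pistone-Sempi construction.

```python
import math

# Sketch: the Luxemburg norm of eq. (17) on a finite sample space, by bisection.
# X is a list of values of the random variable, rho the matching probabilities.

def young_mean(X, rho, r):
    """E_rho[cosh(X / r) - 1]; non-increasing in r > 0."""
    return sum(p * (math.cosh(x / r) - 1.0) for x, p in zip(X, rho))

def luxemburg_norm(X, rho, tol=1e-12):
    """Approximate inf{ r > 0 : E_rho[cosh(X / r) - 1] < 1 }."""
    hi = max(abs(x) for x in X)
    if hi == 0.0:
        return 0.0
    # r = max|X| is admissible, since cosh(1) - 1 < 1; the infimum lies in (0, hi].
    lo = 0.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if young_mean(X, rho, mid) < 1.0:
            hi = mid
        else:
            lo = mid
    return hi

rho = [0.5, 0.5]
print(luxemburg_norm([1.0, -1.0], rho))   # compare with the closed form below
print(1.0 / math.acosh(2.0))
```

For the symmetric two-point variable the condition reduces to $\cosh(1/r) = 2$, which gives the
closed form printed on the last line.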







4 Efron, Dawid and Amari 



A Riemannian metric $G$, eq. (15), gives us a notion of parallel transport, namely, that given
by the Levi-Civita affine connection. Recall that an affine map, $U$ (acting on the right)
from one vector space $T_1$ to another, $T_2$, is one that obeys
\[ (\lambda X + (1 - \lambda) Y) U = \lambda XU + (1 - \lambda) YU, \quad
   \text{for all } X, Y \in T_1 \text{ and all } \lambda \in [0, 1]. \tag{20} \]

The same definition works on an affine space, that is, a convex subset of a vector space. 
This leads to the concept of an affine connection. 

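Let $\mathcal{M}$ be a manifold and denote by $T_\rho$ the tangent space at $\rho \in \mathcal{M}$.
Consider an affine map $U_\gamma(\rho, \sigma) : T_\rho \to T_\sigma$ defined for each pair of points
$\rho$, $\sigma$ and each continuous path $\gamma$ in the manifold starting at $\rho$ and ending at
$\sigma$. Let $\rho$, $\sigma$ and $\tau$ be any three points and $\gamma_1$ a path from $\rho$ to
$\sigma$, and $\gamma_2$ any path from $\sigma$ to $\tau$.

Definition 1 We say that $U$ is an affine connection, if $U_\emptyset = \mathrm{Id}$ and
\[ U_{\gamma_1 \cup \gamma_2} = U_{\gamma_1} U_{\gamma_2}. \tag{21} \]

Let $X$ be a tangent vector at $\rho$; we call $X U_{\gamma_1}$ the parallel transport of $X$ to
$\sigma$, along the path $\gamma_1$.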

We also require $U$ to be smooth in $\sigma$ in a 'hood of the point $\rho$, when we identify a
ball in the tangent space with part of the manifold by the exponential map. In physics it
is usually the differential of $U$ along a specified direction that is called 'affine connection'.
Equivalently, a connection defines a covariant derivative of a vector field on the manifold:
\[ \nabla_Y X := \frac{d}{dt} X U_\gamma(\rho, \gamma(t)) \Big|_{t=0}, \tag{22} \]
where $\{\gamma(t)\}$, $0 \le t \le 1$, is any path from $\rho$ to $\sigma$, which starts at $\rho$ in
the direction $Y \in T_\rho$. This is designed to convert vector fields to tensor fields. Conversely,
a covariant derivative defines a connection. This concept allows us to specify that two tangent
vectors to the manifold at points $\rho$ and $\sigma$ are parallel if the parallel transport (along
a specified curve) of one from $\rho$ to $\sigma$ is proportional to the other. A geodesic is a
self-parallel curve on $\mathcal{M}$: the tangent vectors to the curve at different points are
parallel, when transported along the curve. Geodesics relative to the Levi-Civita connection are
lines of minimal length, as measured by the metric.

Estimation theory might be considered geometrically as follows. For theoretical reasons,
we expect the distribution of a random variable to lie on a submanifold
$\mathcal{M}_0 \subseteq \mathcal{M}$ of states. The data give us a histogram, which is a
distribution, but not a pretty one. We seek the point on $\mathcal{M}_0$ that is 'closest' to the
data. Suppose that the sample space is $\Omega$, with $|\Omega| < \infty$. Let us place all positive
distributions, including the experimental one, in a common manifold, $\mathcal{M}$. This manifold
will have the Riemannian structure, $G$, provided by the Fisher metric. We then draw the geodesic
curve through the data point that has shortest distance to the sub-manifold $\mathcal{M}_0$; where
it cuts $\mathcal{M}_0$ is our estimate for the state. This procedure, however, does not always
lead to unbiased estimators. Efron [?] and Dawid [?] noticed that the Levi-Civita connection is
not the only useful one, and that there are others that might be used in estimation theory.
First, the ordinary mixture of densities $\rho_1$, $\rho_2$ leads to
\[ \rho = \lambda \rho_1 + (1 - \lambda) \rho_2, \quad 0 < \lambda < 1. \tag{23} \]

Done locally, this leads to a connection on the manifold, now called the $(-1)$-Amari
connection: two tangents are parallel if they are proportional as functions on the sample space.
This differs from the parallelism given by the Levi-Civita connection. We need to use
$(-1)$-geodesics to give unbiased estimates for $f$.

There is another obvious convex structure, that obtained from the linear structure of
the space of centred random variables, also known as the scores. Take $\rho_0 \in \mathcal{M}$ and
write $f_0 = -\log \rho_0$. Consider a perturbation $\rho_X$ of $\rho_0$, which we write as
\[ \rho_X = Z_X^{-1} e^{-f_0 - X}. \tag{24} \]

The random variable $X$ is not uniquely defined by $\rho_X$, since by adding a constant to $X$, we
can adjust the partition function to give the same $\rho_X$. Among all these equivalent $X$ we
can choose the score which has zero expectation in the state $\rho_0$: $\rho_0.X = 0$. We can define
a sort of mixture of two such perturbed states, $\rho_X$ and $\rho_Y$, by
\[ \text{``}\lambda \rho_X + (1 - \lambda) \rho_Y\text{''} := \rho_{\lambda X + (1 - \lambda) Y}. \tag{25} \]

This is a convex structure on the space of states, and differs from that given in eq. (23).
It leads to an affine connection, now called the $(+1)$-Amari connection. How do these
connections relate to the metric?

Definition 2 Let $G$ be a Riemannian metric on the manifold $\mathcal{M}$. A connection
$\gamma \mapsto U_\gamma$ is called a metric connection if
\[ G_\sigma(X U_\gamma, Y U_\gamma) = G_\rho(X, Y) \tag{26} \]
for all tangent vectors $X$, $Y$ and all paths $\gamma$ from $\rho$ to $\sigma$.

The Levi-Civita connection is a metric connection, but the $(\pm)$ Amari connections are not;
they are, however, dual relative to the Rao-Fisher metric; let $\gamma$ be a path connecting $\rho$
with $\sigma$; then for all $X$, $Y$:
\[ G_\sigma(X U^{+}_\gamma(\rho, \sigma), Y U^{-}_\gamma(\rho, \sigma)) = G_\rho(X, Y). \tag{27} \]
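
The two convex structures of eqs. (23) and (25), which underlie the $(\mp 1)$ connections, can
be compared directly on a finite sample space. The Python sketch below forms both mixtures of
two perturbed states; the base state `rho0`, the scores `X` and `Y`, the weight `lam` and the
function names are illustrative assumptions.

```python
import math

# Sketch: the (-1)- and (+1)-mixtures of eqs. (23) and (25) on three points.

def perturbed(rho0, X):
    """rho_X = Z_X^{-1} exp(-f_0 - X) with f_0 = -log rho_0, eq. (24)."""
    weights = [r * math.exp(-x) for r, x in zip(rho0, X)]
    Z = sum(weights)
    return [w / Z for w in weights]

def mixture_minus(p, q, lam):
    """(-1)-mixture, eq. (23): convex combination of the densities."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(p, q)]

def mixture_plus(rho0, X, Y, lam):
    """(+1)-mixture, eq. (25): convex combination of the scores."""
    return perturbed(rho0, [lam * x + (1.0 - lam) * y for x, y in zip(X, Y)])

rho0 = [0.2, 0.3, 0.5]
X = [1.0, 0.0, -0.4]    # centred: rho0.X = 0
Y = [0.0, 1.0, -0.6]    # centred: rho0.Y = 0
lam = 0.25

minus = mixture_minus(perturbed(rho0, X), perturbed(rho0, Y), lam)
plus = mixture_plus(rho0, X, Y, lam)
print(minus)
print(plus)
```

Both results are normalised states, but they are different points of the manifold, which is
the sense in which the two convex structures differ.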

Let $\nabla^{\pm}$ be the two covariant derivatives obtained from the connections $U^{\pm}$. Amari [?]
defines intermediate covariant derivatives
\[ \nabla^{\alpha} = \tfrac{1}{2}(1 + \alpha) \nabla^{+} + \tfrac{1}{2}(1 - \alpha) \nabla^{-}. \tag{28} \]

These uniquely define connections, $U^{(\alpha)}$, whose dual relative to $G$ is $U^{(-\alpha)}$. The
Levi-Civita covariant derivative is the case $\alpha = 0$, which is self-dual and therefore metric,
as is known. Amari shows that $\nabla^{\pm}$ define flat connections without torsion. Flat means that
the transport is independent of the path, and 'no torsion' means that $U$ takes the origin of
$T_\rho$ to the origin of $T_\rho$ around any loop; it is linear, and not a general affine map. In that
case there are affine coordinates, that is, global coordinates in which the respective convex
structure is obtained by simply mixing coordinates linearly. Amari shows that for
$\alpha \neq \pm 1$, $\nabla^{\alpha}$ is not flat, but that the manifold is a sphere in the Banach space
$L^p$, $1/p = -\alpha/2 + 1/2$. In particular, the case $\alpha = 0$ leads to the unit sphere in the
Hilbert space $L^2$, and the Levi-Civita parallel transport is vector translation in this space.
The metric distance between measures is the Hellinger distance, and the natural coordinates are
the square-roots of the densities, imitating the wave-functions of quantum mechanics. Similar
results were obtained in infinite dimensions in [10].



In estimation theory, the method of maximum entropy for unbiased estimators makes
use of the $\nabla^{-}$-connection. This is true also in the dynamics of neural nets, dense liquids,
Onsager theory, Brownian particles in a potential and the Soret and Dufour effects [23];
the micro-state after a small time is replaced by a macrostate, which is the same as the
max-entropy estimation of the state by one on the manifold generated by exponentials of
the macrovariables (or, slow variables). The (intractable) microdynamics is continuously
projected in a rolling construction onto the (easier) manifold of exponential states. This
idea was proposed by Kossakowski [?], Ingarden, et al. [15], and beautifully expounded
by Balian, et al. [?]. The resulting non-linear dynamics can be described thus: after each
time-step of the linear dynamics of the system, Nature makes the best estimate of the state
among those lying on the manifold.



5 The finite quantum info manifold 

Chentsov [?] asked whether the Fisher-Rao metric was unique. Any manifold has a large
number of different metrics on it; apart from those that differ just by a constant factor,
one can multiply a metric by a space-dependent factor. There are many others. Chentsov
therefore imposed conditions on the metric. He saw the metric (and the Fisher metric in
particular) as a measure of the distinguishability of two states. He argued that if this is to
be true, then the distance between two states must be reduced by any stochastic map; for,
a stochastic map must 'muddy the waters', reducing our ability to distinguish states. He
therefore considered the class of metrics $G$ that are reduced by any stochastic map on the
random variables.

Definition 3 A stochastic map is a linear map on the algebra of random variables that 
preserves positivity and takes 1 to itself. 

Chentsov was able to prove that the Fisher-Rao metric is unique, among all metrics, being 
the only one (up to a constant multiple) that is reduced by any stochastic map. It is 
therefore uniquely defined up to this factor within the category of commutative function 
algebras, with stochastic maps as morphisms. 

In quantum mechanics, instead of the abelian algebra of random variables we use the
algebra of matrices $M_n$. Measures on $\Omega$ are replaced by 'states', that is, $n \times n$ density
matrices. For convenience we limit discussion to the interior of the set of states; these
are positive-definite matrices of trace 1, which are faithful states and invertible matrices.
We take this set to be the manifold $\mathcal{M}$; it is a genuine manifold, and not one of the
non-commutative manifolds without points that occur in Connes's theory. The natural
morphisms of the quantum info manifold are the completely positive maps that preserve the
identity. Chentsov found that uniqueness of the metric is not true for quantum mechanics.
(Actually, Petz completed the analysis after Chentsov died; see [13].)

As in the classical case, there are several affine structures on this manifold. The first
comes from the mixing of the states, and is called the $-1$-affine structure. Coordinates for
a state $\rho$ in a 'hood of $\rho_0$ are provided by $\rho - \rho_0$, a small traceless matrix. The
whole tangent space at $\rho$ is thus identified with the set of traceless matrices, and this is a
vector space with the usual rules for adding matrices. Obviously, the manifold is flat relative to
this affine structure.

The $+1$-affine structure is constructed as follows. Since a state $\rho_0 \in \mathcal{M}$ is
faithful we can write $H_0 := -\log \rho_0$ and any $\rho$ near $\rho_0 \in \mathcal{M}$ as
\[ \rho = Z_X^{-1} \exp -(H_0 + X) \tag{29} \]
for some Hermitian matrix $X$, which is ambiguous up to a multiple of the identity. We
choose to fix $X$ by requiring $\rho_0.X = 0$, and call $X$ the 'score' of $\rho$. Then the tangent
space at $\rho$ can be identified with the set of scores, and the $+1$-linear structure is given by
matrix addition of the scores. Corresponding to these two affine structures, there are two affine
connections, whose covariant derivatives are denoted $\nabla^{\pm}$. Following Hasegawa [?], one
can also form interpolating affine structures from eq. (28).

As an example of a metric on $\mathcal{M}$, let $\rho \in \mathcal{M}$, and for $X$, $Y$ in $T_\rho$
define the GNS metric by
\[ G_\rho(X, Y) = \mathrm{Re}\, \mathrm{Tr}[\rho X Y]. \tag{30} \]
This metric is reduced by all cp stochastic maps $F$; that is, it obeys
\[ G_{F^*\rho}(XF, XF) \le G_\rho(X, X), \tag{31} \]
in accordance with Chentsov's idea. $G$ is just the real part of the scalar product in the
Gelfand-Naimark-Segal construction, and is positive definite since $\rho$ is faithful. This has
been adopted by Helstrom and others [14, 26, 18] in quantum estimation theory. However,
Nagaoka [17] has noted that if we take this metric, then the $(+1)$ and the $(-1)$ affine
connections are not dual; the dual to the $(-1)$ affine connection, relative to this
metric, is not flat and has torsion. This failure of duality is confirmed in [13].

In estimation theory we naturally seek a quantum analogue of the Cramer-Rao inequality.
Given a family $\mathcal{M}$ of density operators, parametrised by a real parameter $\eta$, we seek
an estimator $X$ whose mean we can measure in the true state $\rho_\eta$. To be unbiased, we require
$\mathrm{Tr}\, \rho_\eta X = \eta$, which, as in the classical case, gives
\[ \mathrm{Tr} \Big\{ \rho_\eta\, \rho_\eta^{-1} \frac{\partial \rho_\eta}{\partial \eta} (X - \eta) \Big\} = 1. \tag{32} \]

It is tempting to regard $L_r = \rho^{-1} \partial \rho / \partial \eta$ as a quantum analogue of the
Fisher info; it has zero mean, and the above equation says that its covariance with $X - \eta$ is
equal to 1. The Schwarz inequality then leads to $V(X) \ge [\rho_\eta.(L_r^* L_r)]^{-1}$, where we use
$\rho.X$ to denote $\mathrm{Tr}[\rho X]$. For several estimators, the method used earlier gives this
as a matrix inequality.

However, $\rho$ and its derivative do not (in general) commute, so $L_r$ is not Hermitian, and
is not popular as a measure of quantum information. Helstrom, and Petz and Toth [?] get
round this by using the idea of a logarithmic derivative. Let $g$ be a real or complex scalar
product on the space of matrices; we say that a matrix $L$ is the $g$-logarithmic derivative of
the family $\rho_\eta$ if for any matrix $X$,
\[ \frac{\partial}{\partial \eta} \rho_\eta.X = g(L^*, X). \tag{33} \]

The symmetric logarithmic derivative uses the real part of the GNS metric for $g$, so that
\[ \frac{\partial}{\partial \eta} \mathrm{Tr}(\rho_\eta X) = \frac{1}{2} \mathrm{Tr}[\rho_\eta (L_s X + X L_s)]. \tag{34} \]

Another metric in Chentsov's allowed class is the Bogoliubov-Kubo-Mori metric; let $X$
and $Y$ have zero mean in the state $\rho$. Then put
\[ g_\rho(X, Y) = \int_0^1 \mathrm{Tr}\big[ \rho^{\alpha} X \rho^{1 - \alpha} Y \big]\, d\alpha. \tag{35} \]
This is one of the family of scalar products found by Petz to obey the Chentsov property
(a similar property was proved in [23], with detailed balance replacing complete positivity).
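
To make the difference between the GNS metric (30) and the BKM metric (35) concrete, the
Python sketch below evaluates both in a basis where $\rho$ is diagonal. There the integral over
$\alpha$ can be done in closed form and gives the logarithmic mean of pairs of eigenvalues. The
function names, the eigenvalues `p` and the matrices `sigma_x` and `D` are assumptions chosen
for illustration.

```python
import math

# Sketch: GNS metric (30) and BKM metric (35) in a basis where rho is diagonal
# with eigenvalues p_i > 0. X and Y are Hermitian matrices as nested lists.

def log_mean(a, b):
    """integral_0^1 a^alpha b^(1 - alpha) d alpha, the logarithmic mean of a and b."""
    return a if a == b else (a - b) / (math.log(a) - math.log(b))

def gns_metric(p, X, Y):
    """Re Tr[rho X Y] = Re sum_ij p_i X_ij Y_ji."""
    n = len(p)
    total = sum(p[i] * X[i][j] * Y[j][i] for i in range(n) for j in range(n))
    return total.real

def bkm_metric(p, X, Y):
    """integral_0^1 Tr[rho^alpha X rho^(1 - alpha) Y] d alpha, in closed form."""
    n = len(p)
    total = sum(log_mean(p[i], p[j]) * X[i][j] * Y[j][i]
                for i in range(n) for j in range(n))
    return total.real

p = [0.7, 0.3]
sigma_x = [[0.0, 1.0], [1.0, 0.0]]   # off-diagonal, zero mean in rho
D = [[0.3, 0.0], [0.0, -0.7]]        # commutes with rho, zero mean in rho

print(gns_metric(p, sigma_x, sigma_x), bkm_metric(p, sigma_x, sigma_x))
print(gns_metric(p, D, D), bkm_metric(p, D, D))
```

For the commuting direction `D` the two metrics coincide with the classical Fisher expression,
while for the non-commuting direction `sigma_x` they give different values.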



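The corresponding logarithmic derivative, $L_b$, is defined such that
\[ \frac{\partial}{\partial \eta} \rho_\eta.X
   = \int_0^1 \mathrm{Tr}\big[ \rho_\eta^{\lambda} L_b\, \rho_\eta^{1 - \lambda} X \big]\, d\lambda \tag{36} \]
and is given explicitly by
\[ L_b = \int_0^\infty (\lambda + \rho_\eta)^{-1} \frac{\partial \rho_\eta}{\partial \eta}
         (\lambda + \rho_\eta)^{-1}\, d\lambda. \tag{37} \]

Each metric leads to a Cramer-Rao inequality, also in matrix form for several estimators,
and some of these are stronger than others [19, 20].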



The BKM metric has other desirable properties, apart from entering in Kubo's 'theory
of linear response'. For the metric $g$, the connections with covariant derivatives
$\nabla^{(\pm\alpha)}$ are dual, and there are affine coordinates for $\nabla^{\alpha}$, namely, it is
the unit sphere in the (finite-dim.) Banach space $C_p$, the Schatten class with norm
$\|X\|_p = (\mathrm{Tr}|X|^p)^{1/p}$. The case $p = 2$, or $\alpha = 0$, leads to the Hilbert space of
Hilbert-Schmidt operators, which has been used in [?]. More, the Massieu function $\log Z$ is the
generating function for all the connected Kubo functions, and in particular, the mean is the
first derivative, and the metric is the second, as in eq. (14). The entropy is again the Legendre
transform of the Massieu function, and the reciprocal relations of eq. (15) hold. It follows that
the Cramer-Rao inequality for the BKM-metric is achieved exactly for the exponential family,
agreeing with the method of maximum entropy.



6 Araki's expansionals and the analytic manifold 

Araki [?] has considered the case where $\rho$ is a KMS state on a $W^*$-algebra. He then perturbed
the state by adding bounded operators to the KMS Hamiltonian; the perturbed KMS state
has a convergent Kubo-Mori perturbation expansion, which defines an analytic function
in the Banach space of bounded perturbations. We [?] try to follow this for unbounded
perturbations.

Let $\Sigma$ be the set of density operators on $\mathcal{H}$, and let $\mathrm{int}\,\Sigma$ be its
interior, the faithful states. We shall deal only with systems described by
$\rho \in \mathrm{int}\,\Sigma$; this means that for a free Schrodinger particle, or system of such, we
are limited to systems inside a finite volume of real space. Then we would expect the entropy to
be finite. The following class of states turns out to be tractable. Let $p \in (0, 1)$ and let
$C_p$ denote the set of operators $C$ such that $|C|^p$ is of trace class. This is like the Schatten
class, except that we are in the bad case, $0 < p < 1$, for which
$C \mapsto (\mathrm{Tr}[|C|^p])^{1/p}$ is only a quasi-norm. Let
\[ C_< = \bigcup_{0 < p < 1} C_p. \tag{38} \]

One can show that the entropy
\[ S(\rho) := -\mathrm{Tr}[\rho \log \rho] \tag{39} \]
is finite for all states in $C_<$. We take the underlying set of the quantum info manifold to be
\[ \mathcal{M} = C_< \cap \mathrm{int}\,\Sigma. \tag{40} \]

We shall cover $\mathcal{M}$ with balls, each belonging to a Banach space, and shall show that we
have a Banach manifold when $\mathcal{M}$ is furnished with the topology induced by the norms; for
this, the main problem is to ensure that various Banach norms are equivalent.

Let $\rho_0 \in \mathcal{M}$ and write $H_0 = -\log \rho_0 + cI$. We choose $c$ so that $H_0 \ge I$,
and we write $R_0 = H_0^{-1}$ for the resolvent at 0. We define a 'hood of $\rho_0$ to be the set of
states of the form
\[ \rho_V = Z_V^{-1} \exp -(H_0 + V), \tag{41} \]
where $V$ is a sufficiently small $H_0$-bounded form perturbation of $H_0$. The necessary and
sufficient condition to be Kato-bounded is that
\[ \|V\|_0 := \| R_0^{1/2} V R_0^{1/2} \|_\infty < \infty. \tag{42} \]

The set of such $V$ makes up a Banach space, $T(0)$, with (42) as norm. The first result is
that $\rho_V \in \mathcal{M}$ for $V$ inside a small ball in $T(0)$. For the proof, let $a$ be the
form-bound of $V$, and let $q_V$ be the form of $H_0 + V$. Then we have for some $b > 0$,
\[ -bI + (1 - a) q_0 \le q_V \le bI + (1 + a) q_0. \tag{43} \]

Let $L$ be any finite dimensional subspace of $\mathrm{Dom}\, q_0$, and put
\[ \Lambda(q, L) = \sup\{ q(\psi, \psi) : \|\psi\| = 1,\ \psi \in L \}. \tag{44} \]

Then the ordered eigenvalues of $q$ are given by
\[ \lambda(q, n) = \inf\{ \Lambda(q, L) : \dim L = n \}. \tag{45} \]

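From (43) we have for each $L$ of dimension $n$,
\[ -b + (1 - a) \lambda(q_0, n) \le \Lambda(q_V, L). \tag{46} \]

Since $\lambda(q_0, n) \to \infty$ with $n$, the spectrum of $H_V$ is purely discrete. Thus
\[ \exp \beta (b - (1 - a) \lambda(q_0, n)) \ge \exp -\beta \lambda(q_V, n). \tag{47} \]

Summing over $n$ gives the traces
\[ \mathrm{Tr}\, e^{-\beta H_V} \le \mathrm{Tr}\, e^{\beta (b - (1 - a) H_0)}, \]
so $e^{-\beta H_V}$ is of trace class for some $\beta < 1$ if $a$ is small enough.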
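
The trace bound can be illustrated in the simplest setting, where $H_0$ and $V$ commute and are
both diagonal, so that the form bound holds eigenvalue by eigenvalue. The Python sketch below
uses a truncated spectrum; the eigenvalues `h`, the perturbation `v`, the constants `a`, `b`,
`beta` and the cut-off `N` are all assumptions made for this toy example, and the
non-commuting case treated in the text is not covered by it.

```python
import math

# Sketch of the bound following eq. (47) in the commuting (diagonal) case.
# H_0 has eigenvalues h_n = n + 1, so H_0 >= I; V is diagonal with
# |v_n| <= a * h_n + b. All constants are illustrative.

a, b, beta, N = 0.2, 0.5, 0.9, 200
h = [n + 1.0 for n in range(N)]
v = [a * hn * math.sin(n) + b * math.cos(n) for n, hn in enumerate(h)]

# Tr exp(-beta H_V) with H_V = H_0 + V, truncated to the first N levels.
lhs = sum(math.exp(-beta * (hn + vn)) for hn, vn in zip(h, v))
# Tr exp(beta (b - (1 - a) H_0)) on the same levels.
rhs = sum(math.exp(beta * (b - (1.0 - a) * hn)) for hn in h)
print(lhs, rhs)
```

Each term on the left is bounded by the matching term on the right, which is the diagonal
version of the step from (46) to (47).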

We now consider [25] the special case when $V$ is $H_0$-bounded as an operator; the
condition for this is $\|R_0 V\| < \infty$. Then $V$ is also form-bounded, since
\[ \| R_0^{1/2} V R_0^{1/2} \|_\infty \le \| R_0 V \|_\infty < \infty. \tag{48} \]

In this case we can use the larger norm to provide a topology. This is not equivalent to the
topology we get using the norm (42); we are moving from $\rho_0$ in a direction more regular
than the general direction in the tangent space, and this allows us to furnish this slice of
the manifold with a stronger topology. The state defined by $V$ is given by
\[ \rho_V := Z_V^{-1} \exp -(H_0 + V). \tag{49} \]

Thus, $V$ and $V + cI$ give rise to the same state; near $\rho_0$ the regular directions in
$\mathcal{M}$ are thus parametrised by the quotient space
\[ \hat{T} = T/\{cI\}. \tag{50} \]

We may therefore use the score, $V - \rho_0.V$, as coordinates for the 'regular' manifold, now
using just the operator bounded perturbations. We show that these are displacements of
the state in analytic directions; in [11] we find a more general class of analytic directions,
which together make up the 'analytic' manifold. This is an attempt to find the quantum
analogue of the Cramer class. We shall come to this later.

The norms $\|R_0 V\|_\infty$ on overlapping regions are equivalent. For, around $\rho_V$ we perturb
with $X$ such that $\|R_V X\|_\infty < \infty$, and
\[ \|R_V X\|_\infty = \|R_V H_0 R_0 X\|_\infty \le \|R_V H_0\|_\infty \, \|R_0 X\|_\infty, \tag{51} \]
and the converse inequality holds similarly. We define the $(+)$-affine connection by
transporting the score $V - \mathrm{Tr}\, \rho V$ at the point $\rho$ to the score
$V - \mathrm{Tr}\, \sigma V$ at $\sigma$. This connection is flat and torsion-free, since it patently
does not depend on the path between $\rho$ and $\sigma$. The $(-)$-connection can be defined in
$\mathcal{M}$ since each $C_p$ is a vector space. It is likely, but not proved, that the
$(-)$-mixture of states is continuous in the topology we have defined here.
A case between operator bounded and form bounded is $\epsilon$-bounded:
\[ \|V\|_\epsilon := \| R_0^{1/2 - \epsilon} V R_0^{1/2 + \epsilon} \|_\infty < \infty, \quad 0 < \epsilon < 1/2. \tag{52} \]

This is the analogue of the Cramer class, since we prove that $Z$ is an analytic function of
$V$ in this case.

Araki proved that if $V$ is bounded, the Kubo-Mori expansion converges:
\[ \log Z_V = \sum_{n=0}^{\infty} (n!)^{-1} \int_0^1 \prod_i d\alpha_i\,
   \delta\Big( \sum_i \alpha_i - 1 \Big) K_n, \tag{53} \]
where
\[ K_n := \mathrm{Tr}(\rho^{\alpha_1} V \ldots \rho^{\alpha_n} V). \tag{54} \]

We prove (with Grasselli) that the series converges also for $\epsilon$-bounded perturbations, and
that the $\|V\|_\epsilon$ are equivalent on overlapping regions. We now give an outline of the method.

We need an economical estimate for the $n$-Kubo function. If $V$ were bounded, we could
use the Holder inequality for traces, with $p_i = 1/\alpha_i$, using that $\sum_i \alpha_i = 1$:
\[ |\mathrm{Tr}[\rho^{\alpha_1} V_1 \ldots \rho^{\alpha_n} V_n]|
   \le \mathrm{Tr}\,\rho\; \|V_1\|_\infty \ldots \|V_n\|_\infty. \tag{55} \]

We do better, since there is $\beta < 1$ such that $\rho^\beta$ is of trace class, so we can replace
$\rho$ by $\rho^\beta$. We can thus borrow $\rho^{(1 - \beta)\alpha_j}$ to help bound the potentials.
Also, as $\sum_j \alpha_j = 1$, the region of integration is the (overlapping) union of regions
$S_j$ where $\alpha_j \ge 1/n$. By cyclicity, we may take $j = n$. We then write
$\rho^{\alpha_j} V_j$ as



(56) [display illegible in the source; the surviving fragment reads $\rho^{\cdots} \ldots R_0^{\cdots} V_j R_0^{\cdots}$]

The dots are factors taken with other terms. We bound the middle $[\ldots]$ by the spectral
theorem, arranging the parameters $\delta_j$ so that we get an integrable function of $\alpha_j$ in
$S_n$, $1 \le j \le n - 1$. We bound the final $[\ldots]$ using the $\epsilon$-boundedness of $V$, by a
suitable choice of the $\delta_j$. We end up with a factorial bound on the $n$-point function, so
the series converges as a geometric series.

The manifold can be furnished with a real-analytic structure, by asserting that the ring of
germs of analytic functions on the manifold consists of functions that are analytic in these
analytic directions. The mixture coordinates $\eta$ are examples of analytic functions; we say
that we have an analytic parametrisation of the manifold by $\eta$. It remains to prove that the
$\xi$ are analytic functions of $\eta$, before we can say that $\eta$ are analytic coordinates.



7 Singular perturbations



Every point of our manifold has some directions in its tangent space that remain within
$\mathcal{M}$ but are not analytic directions. Consider the anharmonic oscillator,
\[ H = (p^2 + q^2)/2 + \lambda q^{2n}, \quad \lambda > 0. \tag{57} \]

It is known that $\exp -\beta H$ is of trace-class for all $\beta > 0$, so these states are in
$\mathcal{M}$. It is also known that there is a singularity at $\lambda = 0$. Our result shows that
if we start at $\lambda > 0$ then there is a region around this state where the manifold has
analytic directions. Obviously, any point in $\mathcal{M}$ has many analytic directions: the
bounded perturbations provide many such. The metric is finite in a much wider class of
directions: if $\rho^\beta$ is of trace-class, and $V$ is a form such that $\rho^\delta V$ is bounded
for $\delta = (1 - \beta)/2$, then a regularised BKM metric in the $V$-direction is finite at $\rho$.



The natural class of states, the analogue of the Orlicz space of [21], is the set
$\mathcal{M}_{\max}$ of states of finite entropy. The natural class of states $\sigma$ in a 'hood of
a state $\rho$ of finite entropy consists of states of finite entropy whose entropy relative to
$\rho$ is also finite. This 'hood will consist of many non-analytic perturbations of $\rho$. It is
known that the $-1$-mixture (the usual mixture) of states of finite entropy has finite entropy, so
$\mathcal{M}_{\max}$ has the $-1$-affine structure. Here is a simple proof.

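Theorem 4
\[ S(\lambda\rho + (1 - \lambda)\sigma) \le \lambda S(\rho) + (1 - \lambda) S(\sigma)
   + \lambda \log(1/\lambda) + (1 - \lambda) \log(1/(1 - \lambda)). \tag{58} \]
Proof.

$-\log x$ is an operator monotone decreasing function. Since
$\lambda\rho + (1 - \lambda)\sigma \ge \lambda\rho$, we have
\[ -\log(\lambda\rho + (1 - \lambda)\sigma) \le -\log(\lambda\rho). \]
Hence
\[ -\lambda\rho.\log(\lambda\rho + (1 - \lambda)\sigma) \le -\lambda\rho.\log(\lambda\rho). \]
Similarly
\[ -(1 - \lambda)\sigma.\log(\lambda\rho + (1 - \lambda)\sigma) \le -(1 - \lambda)\sigma.\log((1 - \lambda)\sigma). \]
Adding gives
\[ S(\lambda\rho + (1 - \lambda)\sigma) \le -\lambda\rho.\log(\lambda\rho) - (1 - \lambda)\sigma.\log((1 - \lambda)\sigma) \]
\[ = \lambda S(\rho) + (1 - \lambda) S(\sigma) + \lambda \log(1/\lambda) + (1 - \lambda) \log(1/(1 - \lambda)) < \infty. \]

So the space $\mathcal{M}_{\max}$ of density matrices of finite entropy is a $(-1)$-affine space.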
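
For commuting states, that is, for ordinary probability vectors, the inequality (58) is easy to
test numerically. The Python sketch below computes the gap between the two sides; the function
names and the sample distributions are assumptions made for illustration, and the check says
nothing about the non-commuting case handled by the operator-monotonicity argument above.

```python
import math

# Sketch: numerical check of eq. (58) for diagonal (commuting) states,
# i.e. probability vectors p and q. Inputs are illustrative.

def entropy(p):
    """S(p) = -sum_x p(x) log p(x), with 0 log 0 taken as 0."""
    return -sum(x * math.log(x) for x in p if x > 0.0)

def mixing_bound_gap(p, q, lam):
    """Right-hand side minus left-hand side of eq. (58), for 0 < lam < 1."""
    mix = [lam * a + (1.0 - lam) * b for a, b in zip(p, q)]
    rhs = (lam * entropy(p) + (1.0 - lam) * entropy(q)
           + lam * math.log(1.0 / lam)
           + (1.0 - lam) * math.log(1.0 / (1.0 - lam)))
    return rhs - entropy(mix)

p = [0.6, 0.3, 0.1]
q = [0.1, 0.2, 0.7]
for lam in (0.1, 0.5, 0.9):
    print(lam, mixing_bound_gap(p, q, lam))

# States with disjoint supports saturate the bound (gap numerically zero).
print(mixing_bound_gap([1.0, 0.0], [0.0, 1.0], 0.3))
```

The last line shows that the bound cannot be improved in general: when the supports are
disjoint, the entropy of the mixture picks up the full mixing term.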



In [24] we propose a Luxemburg norm for the tangent space at a point
$\rho \in \mathcal{M}_{\max}$. We expect that a 'hood of a point $\rho$ will consist of all states
$\sigma \in \mathcal{M}_{\max}$ having finite relative entropy, thus:
$S(\sigma|\rho) := \rho.(\log \rho - \log \sigma) < \infty$.